While deep learning succeeds in a wide range of tasks, it highly depends on the massive collection of annotated data which is expensive and time-consuming. To lower the cost of data annotation, active learning has been proposed to interactively query an oracle to annotate a small proportion of informative samples in an unlabeled dataset. Inspired by the fact that the samples with higher loss are usually more informative to the model than the samples with lower loss, in this paper we present a novel deep active learning approach that queries the oracle for data annotation when the unlabeled sample is believed to incorporate high loss. The core of our approach is a measurement Temporal Output Discrepancy (TOD) that estimates the sample loss by evaluating the discrepancy of outputs given by models at different optimization steps. Our theoretical investigation shows that TOD lower-bounds the accumulated sample loss thus it can be used to select informative unlabeled samples. On basis of TOD, we further develop an effective unlabeled data sampling strategy as well as an unsupervised learning criterion for active learning. Due to the simplicity of TOD, our methods are efficient, flexible, and task-agnostic. Extensive experimental results demonstrate that our approach achieves superior performances than the state-of-the-art active learning methods on image classification and semantic segmentation tasks. In addition, we show that TOD can be utilized to select the best model of potentially the highest testing accuracy from a pool of candidate models.
translated by 谷歌翻译
Recent deep learning methods have achieved promising results in image shadow removal. However, their restored images still suffer from unsatisfactory boundary artifacts, due to the lack of degradation prior embedding and the deficiency in modeling capacity. Our work addresses these issues by proposing a unified diffusion framework that integrates both the image and degradation priors for highly effective shadow removal. In detail, we first propose a shadow degradation model, which inspires us to build a novel unrolling diffusion model, dubbed ShandowDiffusion. It remarkably improves the model's capacity in shadow removal via progressively refining the desired output with both degradation prior and diffusive generative prior, which by nature can serve as a new strong baseline for image restoration. Furthermore, ShadowDiffusion progressively refines the estimated shadow mask as an auxiliary task of the diffusion generator, which leads to more accurate and robust shadow-free image generation. We conduct extensive experiments on three popular public datasets, including ISTD, ISTD+, and SRD, to validate our method's effectiveness. Compared to the state-of-the-art methods, our model achieves a significant improvement in terms of PSNR, increasing from 31.69dB to 34.73dB over SRD dataset.
translated by 谷歌翻译
The recent neural implicit representation-based methods have greatly advanced the state of the art for solving the long-standing and challenging problem of reconstructing a discrete surface from a sparse point cloud. These methods generally learn either a binary occupancy or signed/unsigned distance field (SDF/UDF) as surface representation. However, all the existing SDF/UDF-based methods use neural networks to implicitly regress the distance in a purely data-driven manner, thus limiting the accuracy and generalizability to some extent. In contrast, we propose the first geometry-guided method for UDF and its gradient estimation that explicitly formulates the unsigned distance of a query point as the learnable affine averaging of its distances to the tangent planes of neighbouring points. Besides, we model the local geometric structure of the input point clouds by explicitly learning a quadratic polynomial for each point. This not only facilitates upsampling the input sparse point cloud but also naturally induces unoriented normal, which further augments UDF estimation. Finally, to extract triangle meshes from the predicted UDF we propose a customized edge-based marching cube module. We conduct extensive experiments and ablation studies to demonstrate the significant advantages of our method over state-of-the-art methods in terms of reconstruction accuracy, efficiency, and generalizability. The source code is publicly available at https://github.com/rsy6318/GeoUDF.
translated by 谷歌翻译
Oriented normals are common pre-requisites for many geometric algorithms based on point clouds, such as Poisson surface reconstruction. However, it is not trivial to obtain a consistent orientation. In this work, we bridge orientation and reconstruction in implicit space and propose a novel approach to orient point cloud normals by incorporating isovalue constraints to the Poisson equation. Our key observation is that when using a point cloud with consistently oriented normals as the input for implicit surface reconstruction, the indicator function values of the sample points should be close to the isovalue of the surface. Based on this observation and the Poisson equation, we propose an optimization formulation that combines isovalue constraints with local consistency requirements for normals. We optimize normals and implicit functions simultaneously and solve for a globally consistent orientation. Thanks to the sparsity of the linear system, our method can work on an average laptop with reasonable computational time. Experiments show that our method can achieve high performance in non-uniform and noisy data and manage varying sampling densities, artifacts, multiple connected components, and nested surfaces.
translated by 谷歌翻译
组织病理学图像包含丰富的表型信息和病理模式,这是疾病诊断的黄金标准,对于预测患者预后和治疗结果至关重要。近年来,在临床实践中迫切需要针对组织病理学图像的计算机自动化分析技术,而卷积神经网络代表的深度学习方法已逐渐成为数字病理领域的主流。但是,在该领域获得大量细粒的注释数据是一项非常昂贵且艰巨的任务,这阻碍了基于大量注释数据的传统监督算法的进一步开发。最新的研究开始从传统的监督范式中解放出来,最有代表性的研究是基于弱注释,基于有限的注释的半监督学习范式以及基于自我监督的学习范式的弱监督学习范式的研究图像表示学习。这些新方法引发了针对注释效率的新自动病理图像诊断和分析。通过对130篇论文的调查,我们对从技术和方法论的角度来看,对计算病理学领域中有关弱监督学习,半监督学习以及自我监督学习的最新研究进行了全面的系统综述。最后,我们提出了这些技术的关键挑战和未来趋势。
translated by 谷歌翻译
综合虚拟人类及其3D环境之间的自然相互作用对于众多应用程序(例如计算机游戏和AR/VR体验)至关重要。我们的目标是使人类与给定的3D场景进行互动,该场景由高级语义规格控制为动作类别和对象实例,例如“坐在椅子上”。将相互作用语义纳入生成框架中的主要挑战是学习一个共同表示,该表示有效地捕获了异质信息,包括人体的关节,3D对象几何以及相互作用的意图。为了应对这一挑战,我们设计了一种基于变压器的新型生成模型,其中铰接的3D人体表面点和3D对象共同编码在统一的潜在空间中,并且人与物体之间的相互作用语义是通过嵌入的。位置编码。此外,受到人类可以同时与多个对象相互作用的相互作用的组成性质的启发,我们将相互作用语义定义为不同原子动作对象对的组成。我们提出的生成模型自然可以结合不同数量的原子相互作用,从而无需复合相互作用数据即可合成组成的人类习惯相互作用。我们使用交互语义标签和场景实例分割扩展了Prox数据集,以评估我们的方法,并证明我们的方法可以通过语义控制生成现实的人类场景相互作用。我们的感知研究表明,我们合成的虚拟人类可以自然与3D场景相互作用,从而超过现有方法。我们将方法硬币命名,用于与语义控制的组成相互作用合成。代码和数据可在https://github.com/zkf1997/coins上获得。
translated by 谷歌翻译
通过将熵编解码器应用于学习的数据分布,神经压缩机在压缩比方面显着优于传统编解码器。但是,神经网络的高推断潜伏期阻碍了实际应用中神经压缩机的部署。在这项工作中,我们提出了仅整数离散流(IODF),这是一种具有仅整数算术的有效神经压缩机。我们的工作建立在整数离散流的基础上,该流程包括离散随机变量之间的可逆转换。我们提出了基于8位量化的纯整数算术的有效可逆转换。我们的可逆转换配备了可学习的二进制门,以在推理过程中去除冗余过滤器。我们在GPU上使用Tensorrt部署IODF,与现有最快的神经压缩机相比,达到10倍推理的速度,同时保留了Imagenet32和Imagenet64上的高压缩率。
translated by 谷歌翻译
使用神经网络代表3D对象已变得流行。但是,许多以前的作品采用具有固定体系结构和大小的神经网络来表示不同的3D对象,这导致简单对象的网络参数过多,并且对复杂对象的重建精度有限。对于每个3D模型,希望拥有尽可能少的参数以实现高保真重建的端到端神经网络。在本文中,我们提出了一种利用神经体系结构搜索(NAS)和二进制分类的高效体素重建方法。以层数,每一层的节点数量以及每一层的激活函数为搜索空间,可以根据强化学习技术获得特定的网络体系结构。此外,为了摆脱网络推理后使用的传统表面重建算法(例如,行进立方体),我们通过对二进制体素进行分类来完成端到端网络。与其他签名的距离字段(SDF)预测或二进制分类网络相比,我们的方法使用更少的网络参数获得了更高的重建精度。
translated by 谷歌翻译
近年来,神经网络授权的演员 - 评论家(AC)算法具有重大的经验成功。然而,AC算法的大多数现有的理论支持集中于线性函数近似或线性化神经网络的情况,其中特征表示在整个训练中都是固定的。这种限制未能捕获神经AC中的表示学习的关键方面,这在实际问题中是关键的。在这项工作中,我们采取了一种含义的基于特征神经交流的演变和融合的视角。具体而言,我们考虑一个AC的版本,其中Actor和批评者由过度分辨率的双层神经网络表示,并以两时间测定的学习速率更新。批评评论批评者通过时间差异(TD)学习使用较大的步骤,而演员通过近端策略优化(PPO)更新,具有较小的步骤。在连续时间和无限宽度限制性方案中,当时间尺度适当分开时,我们证明了神经通讯以Sublinear率找到全球最佳政策。此外,我们证明了批评网络引起的特征表示允许在初始概念的邻域内发展。
translated by 谷歌翻译
预先接受的语言模型实现了最先进的导致各种自然语言处理(NLP)任务。 GPT-3表明,缩放预先训练的语言模型可以进一步利用它们的巨大潜力。最近提出了一个名为Ernie 3.0的统一框架,以预先培训大型知识增强型号,并培训了具有10亿参数的模型。 Ernie 3.0在各种NLP任务上表现出最先进的模型。为了探讨缩放的表现,我们培养了百卢比的3.0泰坦参数型号,在PaddlePaddle平台上有高达260亿参数的泰坦。此外,我们设计了一种自我监督的对抗性损失和可控语言建模损失,以使ERNIE 3.0 TITAN产生可信和可控的文本。为了减少计算开销和碳排放,我们向Ernie 3.0泰坦提出了一个在线蒸馏框架,教师模型将同时教授学生和培训。埃塞尼3.0泰坦是迄今为止最大的中国密集预训练模型。经验结果表明,Ernie 3.0泰坦在68个NLP数据集中优于最先进的模型。
translated by 谷歌翻译